Distributional Identification of Non-Referential Pronouns

Authors

  • Shane Bergsma
  • Dekang Lin
  • Randy Goebel
Abstract

We present an automatic approach to determining whether a pronoun in text refers to a preceding noun phrase or is instead non-referential. We extract the surrounding textual context of the pronoun and gather, from a large corpus, the distribution of words that occur within that context. We learn to reliably classify these distributions as representing either referential or non-referential pronoun instances. Despite its simplicity, experimental results on classifying the English pronoun "it" show the system achieves the highest performance yet attained on this important task.
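The pipeline the abstract describes (take the context around a pronoun, then collect the words that fill the same slot in a corpus) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the function names `slot_pattern`, `filler_distribution`, and `pronoun_share`, the fixed window sizes, and the five-sentence `TOY_CORPUS` are all hypothetical. The abstract says the distributions are fed to a learned classifier, which is not reproduced here; the sketch only computes the distribution and one summary number.

```python
from collections import Counter

# Hypothetical toy corpus; the abstract refers to a large corpus instead.
TOY_CORPUS = [
    "it is raining again".split(),
    "it is raining hard".split(),
    "make them in advance".split(),
    "make it in time".split(),
    "make dinner in minutes".split(),
]

def slot_pattern(tokens, i, left=1, right=2):
    """Return the words around tokens[i] as (left_context, right_context).

    The pronoun itself is left out, so the pair acts as a pattern with
    an open slot where the pronoun stood.
    """
    lo = max(0, i - left)
    return tuple(tokens[lo:i]), tuple(tokens[i + 1:i + 1 + right])

def filler_distribution(pattern, corpus):
    """Count which words fill the open slot of `pattern` in `corpus`."""
    left, right = pattern
    counts = Counter()
    for sent in corpus:
        for j in range(len(left), len(sent) - len(right)):
            same_left = tuple(sent[j - len(left):j]) == left
            same_right = tuple(sent[j + 1:j + 1 + len(right)]) == right
            if same_left and same_right:
                counts[sent[j]] += 1
    return counts

def pronoun_share(counts, pronoun="it"):
    """Fraction of slot fillers equal to `pronoun` (0.0 if no matches)."""
    total = sum(counts.values())
    return counts[pronoun] / total if total else 0.0

# "it is raining today": pronoun at index 0, two words of right context.
weather = slot_pattern("it is raining today".split(), 0, left=1, right=2)
weather_counts = filler_distribution(weather, TOY_CORPUS)

# "you can make it in advance": pronoun at index 3, one word on each side.
recipe = slot_pattern("you can make it in advance".split(), 3, left=1, right=1)
recipe_counts = filler_distribution(recipe, TOY_CORPUS)

print(weather_counts, pronoun_share(weather_counts))
print(recipe_counts, pronoun_share(recipe_counts))
```

In this toy data the weather context is filled only by "it", while the recipe context is also filled by "them" and "dinner". A classifier trained on such distributions could plausibly use that contrast as a signal, though which features the authors actually use is not stated in the abstract.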

Similar resources

Instance Sampling for Automatic Identification of Arabic Pleonastic Pronouns (M. Abdul-Mageed)

The term anaphora describes backward reference to items previously occurring in a text (see e.g., Mitkov, 2002). The pointing back item is called an anaphor and the item to which it refers is called its antecedent. The identification of an anaphor’s antecedent is termed anaphora resolution and is considered one of the most difficult tasks in natural language processing (NLP) since it relies on ...

Pronouns Without Explicit Antecedents: How do We Know When a Pronoun is Referential?

Pronouns without explicit noun phrase antecedents pose a problem for any theory of reference resolution. We report here on an empirical study of such pronouns in the Santa Barbara Corpus of Spoken American English, a corpus of spontaneous, casual conversation. Analysis of 2,046 third person personal pronouns in fourteen transcripts indicates that 330 (or 16.1%) lack NP antecedents. These pronou...

Pronoun Interpretation in the Second Language: Effects of Computational Complexity

Children acquiring their native language (L1) have been reported to have greater difficulty in interpreting pronouns than reflexives. In addition, they are less accurate when pronouns refer to referential antecedents than to quantified antecedents, and when they hear full pronouns as opposed to reduced pronouns. We hypothesize that similar difficulties of interpretation will occur for (non-adva...

Referential, Quasi, and Expletive Subjects in L2 English of Persian Speakers

The present study sought to investigate the acquisition of referential, quasi and expletive subject pronouns, three different types of obligatory subjects in English, by adult Persian speaking L2 learners of English at different stages of L2 acquisition. A Grammaticality Judgment Test and a Translation Test were designed and developed to elicit the participants' knowledge of obligatory subjects...

A machine learning method for identifying impersonal constructions and zero pronouns in Spanish (Un método de aprendizaje automático para la identificación de construcciones impersonales y pronombres cero en español)

In this paper, we present a machine learning system for classifying subject ellipsis in Spanish as either referential or non-referential. To the best of our knowledge, this is the first attempt to automatically identify non-referential ellipsis in Spanish. An evaluation of our system against 6,827 finite verbs shows an accuracy of 87%.

Publication date: 2008